A document binarization method based on connected operators
Authors
Abstract
An original binarization method based on connected operators is proposed in this paper. Connected operators make it possible to filter and/or segment an image while preserving its contours. The proposed method extracts relevant document objects by means of the component-tree structure. This strategy was compared with other binarization methods and showed good behavior in various contexts. © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.patrec.2010.04.003
Similar resources
Binarization of color document images via luminance and saturation color features
This paper presents a novel binarization algorithm for color document images. Conventional thresholding methods do not produce satisfactory binarization results for documents with close or mixed foreground colors and background colors. Initially, statistical image features are extracted from the luminance distribution. Then, a decision-tree based binarization method is proposed, which selects v...
Ancient Document Images Enhancement Using Phase Based Binarization
In this paper, we present a phase-based binarization model for degraded document images, as well as a post-processing method that can improve any binarization method and a ground-truth generation tool. Many binarization techniques have been implemented in the literature for different types of binarization problems. They include an adaptive image contrast based document image binarization technique t...
Degraded Document Image Binarization Using Optical Character Recognition
The proposed OCR algorithm retrieves the text in scanned document images. The text detection algorithm is based on two machine learning classifiers: one generates candidate word regions and the other filters out non-text ones. Connected components (CCs) are extracted from images using the maximally stable extremal region algorithm. In CC clustering, AdaBoost classifiers are used t...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing the image, and detecting and correcting image skew. In document segmentation, an algorith...
Journal title:
- Pattern Recognition Letters
Volume 31, Issue
Pages -
Publication date 2010